Abstract

We consider the following model of decision-making by cognitive systems.
We present an algorithm, the quantum-like representation algorithm
(QLRA), which makes it possible to represent probabilistic data of
any origin by complex probability amplitudes. Our conjecture is that
cognitive systems developed the ability to use QLRA. They operate with
complex probability amplitudes, mental wave functions. Since the
mathematical formalism of QM also describes (under some generalization)
the processing of such quantum-like (QL) mental states, the conventional
quantum decision-making scheme can be used by the brain. We consider
a modification of this scheme to describe decision-making in the
presence of two "incompatible" mental variables. Such QL decision-making
can be used in situations like the Prisoner's Dilemma (PD) as well as in
others corresponding to the so-called disjunction effect in psychology
and cognitive science.



1 Introduction 

Recently a new wave of interest in applications of the mathematical formalism
of QM (especially its QI part) was generated by interactions of the quantum
community with various research groups working in artificial intelligence [1],
cognitive science and psychology [2]-[12], finance [13]-[21] and economics
[22], [23] (cf. [H]-[37]). In particular, an important project was started in
[5]-[H], namely, the creation of quantum-like (QL) models of decision-making by
cognitive systems, see also [H], [12]. Since QL modelling of cognition has
always been one of my favorite domains of research [2]-[5], I was happy to
contribute to this project on decision-making by QL cognitive systems, see [38].
In this paper I shall combine the QL cognitive model [38] with Bayesian
statistical inference in the general framework of quantum decision-making,
see e.g. [39], [3D]-[33], [33] (and references in these works). So, we shall
proceed in the same direction as Busemeyer [6]-[8], La Mura [9], [10] and
Franco [11], [12].

We consider the following model of decision-making by cognitive systems. 
We present an algorithm - quantum-like representation algorithm (QLRA) - 
which provides a possibility to represent probabilistic data of any origin by 
complex probability amplitudes. Our conjecture is that cognitive systems de- 
veloped the ability to use QLRA. Thus they operate with complex probability 
amplitudes, mental wave functions. Since the mathematical formalism of QM
also describes (under some generalization, see appendix, section 10) the
processing of such QL mental states, the conventional quantum decision-making
scheme can be used by the brain. We consider a modification of this scheme to describe
decision-making in the presence of two "incompatible" mental variables. Such 
QL decision-making can be used in situations like the Prisoner's Dilemma (PD),
see appendix (section 9), as well as in others corresponding to the so-called
disjunction effect in psychology and cognitive science, see e.g. [45]-[50].

We start this paper with a short review of the QL representation of
contexts, which is based on QLRA; see [51], [55] for a detailed presentation.

2 Contexts, observables, QL-representation 
2.1 Växjö contextual model

Classical as well as quantum probabilistic models can be obtained as particular
cases of our general contextual model, the Växjö model, see [51].

A physical, biological, social, mental, genetic, economic, or financial context
$C$ is a complex of corresponding conditions. Contexts are fundamental elements
of any contextual statistical model.¹

Thus the construction of any probabilistic model $M$ should start with fixing
the collection of contexts of this model. Denote the collection of contexts
by the symbol $\mathcal{C}$ (so the family of contexts $\mathcal{C}$ is
determined by the model $M$ under consideration). Another fundamental element
of any contextual statistical model $M$ is a set of observables $\mathcal{O}$:
each observable $a \in \mathcal{O}$ can be measured under each complex of
conditions $C \in \mathcal{C}$. For an observable $a \in \mathcal{O}$, we denote
the set of its possible values ("spectrum") by the symbol $X_a$. We do not
assume that all these observables can be measured simultaneously. To simplify
considerations, we shall consider only discrete observables and, moreover, all
concrete investigations will be performed for dichotomous observables.

Axiom 1: For any observable $a \in \mathcal{O}$ and its value $y \in X_a$, there
is defined a context, say $C_y$, corresponding to the $y$-selection:² if we
perform a measurement of the observable $a$ under the complex of physical
conditions $C_y$, then we obtain the value $a = y$ with probability 1. We assume
that the set of contexts $\mathcal{C}$ contains $C_y$-selection contexts for all
observables $a \in \mathcal{O}$ and $y \in X_a$.

1 In principle, the notion of context can be considered as a generalization of the widely used
notion of preparation procedure, see e.g. [53], [54], [40]. However, identification of context with
preparation procedure would essentially restrict our theory. In applications outside physics
(e.g., in psychology and cognitive science) we will consider mental contexts. Such contexts
are not simply preparation procedures. The same can be said about economic, political and
social contexts. In this book we shall not provide a deeper formalization of the notion of
context. In our model the notion of context is basic and irreducible.

Axiom 2: Contextual (conditional) probabilities $p_C^a(y) = P(a = y \mid C)$ are
defined for any context $C \in \mathcal{C}$ and any observable $a \in \mathcal{O}$.

Thus, for any context $C \in \mathcal{C}$ and any observable $a \in \mathcal{O}$,
there is defined the probability to observe the fixed value $a = y$ under the
complex of conditions $C$. An especially important role will be played by the
"transition probabilities": $p^{b|a}(x|y) = P(b = x \mid C_y)$,
$a, b \in \mathcal{O}$, $y \in X_a$, $x \in X_b$, where $C_y$ is the
$[a = y]$-selection context. By Axiom 2, for any context $C \in \mathcal{C}$,
the set of probabilities $\{P(a = y \mid C) : a \in \mathcal{O}\}$ is well
defined. We complete this probabilistic data for the context $C$ by transition
probabilities. The corresponding collection of data $D(\mathcal{O}, C)$ consists
of contextual probabilities: $P(a = y \mid C)$, $P(b = x \mid C)$,
$P(b = x \mid C_y)$, $P(a = y \mid C_x)$, ..., where $a, b, \ldots \in \mathcal{O}$.
Finally, we denote the family of probabilistic data $D(\mathcal{O}, C)$ for all
contexts $C \in \mathcal{C}$ by the symbol $\mathcal{D}(\mathcal{O}, \mathcal{C})$.

Definition 1. (Växjö Model) An observational contextual statistical model
of reality is a triple $M = (\mathcal{C}, \mathcal{O}, \mathcal{D}(\mathcal{O}, \mathcal{C}))$,
where $\mathcal{C}$ is a set of contexts and $\mathcal{O}$ is a set of
observables which satisfy Axioms 1 and 2, and $\mathcal{D}(\mathcal{O}, \mathcal{C})$
is probabilistic data about contexts $\mathcal{C}$ obtained with the aid of
observables belonging to $\mathcal{O}$.

We call observables belonging to the set $\mathcal{O} = \mathcal{O}(M)$ reference
observables. Inside a model $M$, observables belonging to the set $\mathcal{O}$
give the only possible references about a context $C \in \mathcal{C}$.

Definition 2. Let $a, b \in \mathcal{O}$. The observable $a$ is said to be
supplementary³ to the observable $b$ if $p^{b|a}(x|y) \neq 0$ for all
$x \in X_b$, $y \in X_a$.



2.2 Law of total probability and its violations 

We recall this law in the simplest case of dichotomous random variables,
$a = y_1, y_2$ and $b = x_1, x_2$, see e.g. [55]:

$$P(b = x) = P(a = y_1) P(b = x \mid a = y_1) + P(a = y_2) P(b = x \mid a = y_2). \tag{1}$$

Thus the probability $P(b = x)$ can be reconstructed on the basis of the
conditional probabilities $P(b = x \mid a = y)$ and known or a priori chosen
probabilities $P(a = y)$.⁴ This formula plays a fundamental role in modern
science. Its consequences are deeply incorporated in modern scientific
reasoning. In [3]-[51] it was pointed out that the quantum formalism induces a
modification of this formula. An additional term, the so-called interference
term, appears on the right-hand side of (1):

$$P(b = x) = P(a = y_1) P(b = x \mid a = y_1) + P(a = y_2) P(b = x \mid a = y_2)
+ 2 \cos\theta \sqrt{P(a = y_1) P(b = x \mid a = y_1) P(a = y_2) P(b = x \mid a = y_2)}. \tag{2}$$

2 See appendix - section 11 - for discussion of selection contexts and contextual forms of
the von Neumann projection postulate.

3 It might be better to call such observables complementary, but Bohr's complementarity
was rigidly coupled with mutual exclusivity. Our supplementarity may be considered as a
version of complementarity, but without mutual exclusivity, see [51].

4 "The prior probability to obtain the result e.g. $b = x_1$ is equal to the prior expected
value of the posterior probability of $b = x_1$ under conditions $a = y_1, y_2$."

The main mathematical consequence of [3]-[51] is that any violation of the
formula of total probability (which need not be coupled to quantum physics)
induces its interference generalization. However, not every violation induces
the ordinary cos-interference. For some contexts, violation of (1) induces
so-called hyperbolic interference. But we shall not consider this type of
interference in the present paper.
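
As a numerical illustration of formulas (1) and (2), the following minimal Python sketch evaluates both for one set of probabilities. The function names and the sample numbers are hypothetical and chosen only for illustration; they are not data from this paper.

```python
import math

def classical_total_probability(p_a, p_b_given_a):
    """Formula (1): P(b=x) = sum_y P(a=y) P(b=x|a=y) for a dichotomous a."""
    return sum(pa * pb for pa, pb in zip(p_a, p_b_given_a))

def interference_total_probability(p_a, p_b_given_a, theta):
    """Formula (2): formula (1) plus the term 2 cos(theta) sqrt(product)."""
    product = 1.0
    for pa, pb in zip(p_a, p_b_given_a):
        product *= pa * pb
    return (classical_total_probability(p_a, p_b_given_a)
            + 2.0 * math.cos(theta) * math.sqrt(product))

# Hypothetical data: [P(a=y1), P(a=y2)] and [P(b=x|a=y1), P(b=x|a=y2)].
p_a = [0.5, 0.5]
p_b_given_a = [0.6, 0.2]

classical = classical_total_probability(p_a, p_b_given_a)
# theta = pi/2 switches the interference term off (up to rounding).
no_interference = interference_total_probability(p_a, p_b_given_a, math.pi / 2)
# theta = 0 and theta = pi give the largest and smallest attainable values.
constructive = interference_total_probability(p_a, p_b_given_a, 0.0)
destructive = interference_total_probability(p_a, p_b_given_a, math.pi)
```

For any angle the result stays between the `destructive` and `constructive` values, which is why a bounded interference coefficient can be written as a cosine.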



3 Quantum-like representation algorithm — QLRA 

We consider two dichotomous supplementary reference observables $a$ and $b$. In
[51] we derived the following formula for interference of contextual
probabilities for the general Växjö model:

$$p_C^b(x) = \sum_y p_C^a(y) p^{b|a}(x|y) + 2 \lambda_x \sqrt{\prod_y p_C^a(y) p^{b|a}(x|y)}, \tag{3}$$

where the coefficient of supplementarity (interference) is

$$\lambda_x = \frac{p_C^b(x) - \sum_y p_C^a(y) p^{b|a}(x|y)}{2 \sqrt{\prod_y p_C^a(y) p^{b|a}(x|y)}}. \tag{4}$$

Contexts such that the interference coefficients $\lambda_x$, $x \in X_b$, are
bounded by one in absolute value are called trigonometric, because in this case
we have the conventional formula of trigonometric interference:

$$p_C^b(x) = \sum_y p_C^a(y) p^{b|a}(x|y) + 2 \cos\theta_x \sqrt{\prod_y p_C^a(y) p^{b|a}(x|y)}, \tag{5}$$

where $\lambda_x = \cos\theta_x$. The parameters $\theta_x$ are said to be
$b|a$-phases with respect to the context $C$. We defined these phases purely on
the basis of probabilities. We did not start with any linear space; in
contrast, we shall define geometry from probability.

We denote the collection of all trigonometric contexts by the symbol
$\mathcal{C}^{tr}$. By using the elementary formula
$D = A + B + 2\sqrt{AB} \cos\theta = |\sqrt{A} + e^{i\theta} \sqrt{B}|^2$, for
real numbers $A, B > 0$, $\theta \in [0, 2\pi]$, we can represent the
probability $p_C^b(x)$ as the square of the complex amplitude (Born's rule):

$$p_C^b(x) = |\psi_C(x)|^2. \tag{6}$$
Here

$$\psi(x) \equiv \psi_C(x) = \sqrt{p_C^a(y_1) p^{b|a}(x|y_1)} + e^{i\theta_x} \sqrt{p_C^a(y_2) p^{b|a}(x|y_2)}, \quad x \in X_b. \tag{7}$$

The formula (7) gives the QL representation algorithm, QLRA. For any
trigonometric context $C$, starting with the probabilistic data $p_C^b(x)$,
$p_C^a(y)$, $p^{b|a}(x|y)$, QLRA produces the complex amplitude $\psi_C$. This
algorithm can be used in any domain of science to create the QL representation
of probabilistic data (for a special class of contexts).
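
The steps of QLRA can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `qlra_amplitude` and the sample probabilities are hypothetical, all products under the square root are assumed nonzero, and the phase is taken from `acos`, i.e. in $[0, \pi]$, which fixes one of the two angles in $[0, 2\pi]$ that share the same cosine.

```python
import cmath
import math

def qlra_amplitude(p_b, p_a, p_ba):
    """Return (thetas, psi) for a trigonometric context.

    p_b[x]     : contextual probability p_C^b(x), x in {0, 1}
    p_a[y]     : contextual probability p_C^a(y), y in {0, 1}
    p_ba[y][x] : transition probability p^{b|a}(x|y)
    """
    thetas, psi = [], []
    for x in range(2):
        t1 = p_a[0] * p_ba[0][x]
        t2 = p_a[1] * p_ba[1][x]
        # Interference coefficient, eq. (4).
        lam = (p_b[x] - (t1 + t2)) / (2.0 * math.sqrt(t1 * t2))
        if abs(lam) > 1.0:
            raise ValueError("context is not trigonometric: |lambda| > 1")
        theta = math.acos(lam)  # lambda_x = cos(theta_x)
        thetas.append(theta)
        # Complex amplitude, eq. (7).
        psi.append(math.sqrt(t1) + cmath.exp(1j * theta) * math.sqrt(t2))
    return thetas, psi

# Hypothetical probabilistic data for one context C.
p_a = [0.5, 0.5]
p_ba = [[0.5, 0.5], [0.5, 0.5]]
p_b = [0.8, 0.2]

thetas, psi = qlra_amplitude(p_b, p_a, p_ba)
born = [abs(z) ** 2 for z in psi]  # Born's rule (6) should recover p_b
```

By construction, `born` reproduces the input `p_b` up to rounding, which is the content of (6).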

We denote the space of functions $\psi : X_b \to \mathbf{C}$ by the symbol
$\Phi = \Phi(X_b, \mathbf{C})$. Since $X_b = \{x_1, x_2\}$, $\Phi$ is the
two-dimensional complex linear space. By using QLRA we construct the map
$J^{b|a} : \mathcal{C}^{tr} \to \Phi(X_b, \mathbf{C})$ which maps contexts
(complexes of, e.g., physical conditions) into complex amplitudes. The
representation (6) of probability is nothing other than the famous Born rule.
The complex amplitude $\psi_C(x)$ can be called a wave function of the complex
of physical conditions (context) $C$ or a (pure) state. We set
$e_x^b(\cdot) = \delta(x - \cdot)$, the Dirac delta-functions concentrated at
the points $x = x_1, x_2$. Born's rule for complex amplitudes (6) can be
rewritten in the following form: $p_C^b(x) = |\langle \psi_C, e_x^b \rangle|^2$,
where the scalar product in the space $\Phi(X_b, \mathbf{C})$ is defined by the
standard formula $\langle \phi, \psi \rangle = \sum_{x \in X_b} \phi(x) \bar{\psi}(x)$.
The system of functions $\{e_x^b\}_{x \in X_b}$ is an orthonormal basis in the
Hilbert space $H = (\Phi, \langle \cdot, \cdot \rangle)$.

Let $X_b \subset \mathbf{R}$. By using the Hilbert space representation of
Born's rule we obtain the Hilbert space representation of the expectation of
the observable $b$:
$E(b|C) = \sum_{x \in X_b} x |\psi_C(x)|^2 = \sum_{x \in X_b} x \langle \psi_C, e_x^b \rangle \overline{\langle \psi_C, e_x^b \rangle} = \langle \hat{b} \psi_C, \psi_C \rangle$,
where the (self-adjoint) operator $\hat{b} : H \to H$ is determined by its
eigenvectors: $\hat{b} e_x^b = x e_x^b$, $x \in X_b$. This is the multiplication
operator in the space of complex functions $\Phi(X_b, \mathbf{C})$:
$\hat{b} \psi(x) = x \psi(x)$. It is natural to represent the $b$-observable (in
the Hilbert space model) by the operator $\hat{b}$.

We would like to have Born's rule not only for the $b$-variable, but also for
the $a$-variable: $p_C^a(y) = |\langle \psi, e_y^a \rangle|^2$, $y \in X_a$.

How can we define the basis $\{e_y^a\}$ corresponding to the $a$-observable? Such
a basis can be found starting with interference of probabilities. We set
$u_j^a = \sqrt{p_C^a(y_j)}$, $p_{ij} = p^{b|a}(x_j|y_i)$, $u_{ij} = \sqrt{p_{ij}}$,
$\theta_j = \theta_C(x_j)$. We have:

$$\psi = u_1^a e_{y_1}^a + u_2^a e_{y_2}^a, \tag{8}$$

where

$$e_{y_1}^a = (u_{11}, u_{12}), \quad e_{y_2}^a = (e^{i\theta_1} u_{21}, e^{i\theta_2} u_{22}). \tag{9}$$

Suppose now that the matrix of transition probabilities $P^{b|a}$ is doubly
stochastic.⁵ Under this condition the system $\{e_{y_i}^a\}$ is an orthonormal
basis iff the probabilistic phases satisfy the constraint
$\theta_2 - \theta_1 = \pi \mod 2\pi$. In this case the $a$-observable is also
represented by a self-adjoint operator $\hat{a}$ which is diagonal with
eigenvalues $y_1, y_2$ in the basis $\{e_{y_i}^a\}$. The conditional average of
the observable $a$ coincides with the quantum Hilbert space average:
$E(a|C) = \sum_{y \in X_a} y \, p_C^a(y) = \langle \hat{a} \psi_C, \psi_C \rangle$.

5 It is a square matrix of nonnegative real numbers, each of whose rows and columns sums
to 1. Thus, a doubly stochastic matrix is both left stochastic and right stochastic.

In the general case (when $P^{b|a}$ need not be doubly stochastic) the
$a$-observable is represented as a generalized quantum observable (a non
self-adjoint operator), see appendix (section 10). We remark that statistical
data obtained in cognitive psychology in experimental tests of the disjunction
effect produce matrices of transition probabilities that are not doubly
stochastic [SO], [JpJ].
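
The orthonormality condition for the $a$-basis can be checked numerically. The sketch below builds the two vectors of (9) and evaluates their scalar product for a doubly stochastic matrix and for a matrix that is only row stochastic. The helper names and both matrices are hypothetical examples, not data from the cited experiments.

```python
import cmath
import math

def a_basis(p_ba, theta1, theta2):
    """Vectors (9): p_ba[i][j] = p^{b|a}(x_j | y_i), theta_j = theta_C(x_j)."""
    u = [[math.sqrt(p) for p in row] for row in p_ba]
    e1 = [complex(u[0][0]), complex(u[0][1])]
    e2 = [cmath.exp(1j * theta1) * u[1][0], cmath.exp(1j * theta2) * u[1][1]]
    return e1, e2

def inner(f, g):
    """Scalar product <f, g> = sum_x f(x) * conj(g(x))."""
    return sum(fx * gx.conjugate() for fx, gx in zip(f, g))

theta1 = 0.4  # arbitrary illustrative phase

# Doubly stochastic case with theta2 - theta1 = pi: orthonormal basis.
doubly_stochastic = [[0.3, 0.7], [0.7, 0.3]]
e1, e2 = a_basis(doubly_stochastic, theta1, theta1 + math.pi)
overlap_ds = abs(inner(e1, e2))

# Rows sum to 1 but columns do not: the same phases leave a nonzero overlap.
only_row_stochastic = [[0.3, 0.7], [0.6, 0.4]]
f1, f2 = a_basis(only_row_stochastic, theta1, theta1 + math.pi)
overlap_general = abs(inner(f1, f2))
```

In both cases the vectors have unit norm, because each row of a transition matrix sums to 1; only the orthogonality depends on double stochasticity and on the phase constraint.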

4 QL Decision-making scheme 

As we have seen, if for some context $C$ the probability distributions of the
supplementary observables $a$ and $b$ are known, then the complex probability
amplitude $\psi_C$ representing $C$ can be reconstructed by using QLRA. This was
the problem of representing probabilistic data by a complex probability
amplitude, see section 3. My conjecture is that the brain developed the ability
for such a QL representation of probabilistic data, see [38] for details. In
such a QL model the brain uses complex probability amplitudes for
decision-making.

We consider the following situation. A (mental) context $C$ is given. The
brain must make a decision about the $b$-attribute, given by e.g.
$b = x_1, x_2$, that is, choose between $b = x_1$ and $b = x_2$. The crucial
point is that another attribute, say $a$ ($= y_1, y_2$), which is supplementary
to $b$, is assumed to be involved in the process of decision-making. Since the
variables $a$ and $b$ are supplementary (under the context $C$), the
interference angles $\theta = (\theta_{x_1}, \theta_{x_2})$ should be
considered, see (7). In the PD, see appendix (section 9), this $a$-attribute is
related to the actions of the other prisoner. In the gambling experiment it is
simply the (classical) random generator producing wins and losses. The latter
example shows that "quantumness" (qualitatively encoded by the interference
angles) is not a feature of $a$ (nor, in fact, of $b$), but appears via the
interrelation of $a$, $b$ and the context $C$. Our scheme of QL decision-making
is based on the assumption that the following are given (created by the brain
on the basis of previous experience):

a) the transition probabilities $p^{b|a}(x|y)$;

b) the probability distribution of $a$: $p_C^a(y)$;

c) the probability distribution of the phase angles
$\theta = (\theta_{x_1}, \theta_{x_2})$: $p_C(\theta)$.

Thus all these distributions are given a priori. One should not always
identify prior probabilities with "subjective probabilities." Previous
frequency experience plays an important role in the determination of these
probability distributions, cf. [56].

The brain uses the formula of total probability with the interference term
to find the $b$-probabilities. Under the assumption that the interference angle
is $\theta_x$, it produces the probabilities

$$p_C^b(x|\theta) = \sum_y p_C^a(y) p^{b|a}(x|y) + 2 \cos\theta_x \sqrt{p_C^a(y_1) p^{b|a}(x|y_1) p_C^a(y_2) p^{b|a}(x|y_2)}.$$
The crucial point of the decision-making scheme is the interpretation of these
probabilities: for each $x$, $p_C^b(x|\theta)$ is the probability that, under
the condition that the $b|a$-interference angle is $\theta_x$ (for the context
$C$), the decision $b = x$ is "right", i.e., it would produce some form of
reward.

By the (classical) Bayes' formula the brain finds the joint probability
distribution:

$$p_C(x, \theta) = p_C(\theta) \Big( \sum_y p_C^a(y) p^{b|a}(x|y) + 2 \cos\theta_x \sqrt{p_C^a(y_1) p^{b|a}(x|y_1) p_C^a(y_2) p^{b|a}(x|y_2)} \Big) \tag{10}$$

and finally the total $b$-probabilities

$$p_C^b(x) = \int_\Theta d\theta \, p_C(\theta) \, p_C^b(x|\theta). \tag{11}$$

As an extension of the interpretation of the conditional probabilities, the
probability $p_C^b(x)$ is considered as the probability that the decision
$b = x$ is right.

In the present decision-making scheme the brain makes the $b = x_1$ decision
if $p_C^b(x_1)$ is larger than $p_C^b(x_2)$ and vice versa, cf. [10], p. 54. The
quantitative meaning of "larger" depends on the cognitive system and possibly
on the context $C$.

We should also mention another QL decision-making scheme. Comparing the
probabilities $p_C^b(x_1)$ and $p_C^b(x_2)$ is an additional act of mental
processing. It needs special neuronal and time resources. The processing might
be especially complicated when these probabilities do not differ essentially.
In such a situation a QL cognitive system might choose the regime of "automatic
probabilistic decision-making", namely, by just using a (classical) random
generator producing the decisions $x_1$ and $x_2$ with the probabilities
$p_C^b(x_1)$ and $p_C^b(x_2)$.
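
Both decision regimes described above can be sketched in Python for a discrete phase prior, where the integral (11) becomes a weighted sum. All function names and numbers are hypothetical illustrations. Because the sample transition matrix is not doubly stochastic, the two total probabilities need not sum to 1; the sampling regime uses them only as relative weights.

```python
import math
import random

def p_b_given_theta(x, theta_x, p_a, p_ba):
    """Formula of total probability with the interference term, for value index x."""
    t1 = p_a[0] * p_ba[0][x]
    t2 = p_a[1] * p_ba[1][x]
    return t1 + t2 + 2.0 * math.cos(theta_x) * math.sqrt(t1 * t2)

def total_b_probabilities(phase_prior, p_a, p_ba):
    """Discrete analogue of (11): average p(x|theta) over the phase prior.

    phase_prior is a list of ((theta_x1, theta_x2), weight) pairs.
    """
    return [
        sum(w * p_b_given_theta(x, th[x], p_a, p_ba) for th, w in phase_prior)
        for x in range(2)
    ]

def decide_by_comparison(probs):
    """First scheme: choose the larger total b-probability (ties go to index 1)."""
    return 0 if probs[0] > probs[1] else 1

def decide_by_sampling(probs, rng):
    """Second scheme: a classical random generator weighted by the b-probabilities."""
    return rng.choices([0, 1], weights=probs, k=1)[0]

# Hypothetical inputs (not taken from the paper).
p_a = [0.6, 0.4]
p_ba = [[0.7, 0.3], [0.2, 0.8]]  # p_ba[y][x] = p^{b|a}(x|y)
phase_prior = [((math.pi / 3, 2 * math.pi / 3), 0.5),
               ((math.pi / 2, math.pi / 2), 0.5)]

probs = total_b_probabilities(phase_prior, p_a, p_ba)
choice = decide_by_comparison(probs)                    # index 0 means b = x_1
sampled = decide_by_sampling(probs, random.Random(0))   # seeded for repeatability
```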

Remark 1. (Comparing with classical probability) We remark that a cognitive
system which uses classical probabilistic processing of information can apply
the conventional formula of total probability (1) to predict the
$b$-probabilities on the basis of the transition probabilities $p^{b|a}(x|y)$
and the $a$-probabilities $p_C^a(y)$. Thus one can consider the proposed QL
scheme as simply the introduction of an additional (interference) parameter
$\theta$ and a modification of the formula of total probability. The main
source of such a modification of the conventional statistical considerations is
the impossibility of combining the context $C$ with the selection contexts
$C_{y_j}$ and hence of obtaining the probabilities $P(b = x \mid C C_{y_j})$,
cf. the resolution of "Simpson's paradox" in [57]. As we have seen, a QL
cognitive system $\tau_{QL}$ cannot proceed in the same way. The formula of
total probability with the interference term contains not only the transition
probabilities and the $a$-probabilities, but also phases, and the latter are
unknown. Thus even by choosing e.g. prior probabilities $p_C^a(y)$ (under the
condition that the transition probabilities were obtained via frequency
experience), $\tau_{QL}$ could not predict the $b$-probabilities.

By using QLRA the cognitive system $\tau_{QL}$ can construct for each
$\theta = (\theta_{x_1}, \theta_{x_2})$ the complex probability amplitude
$\psi_{C,\theta}(x)$. Then the $b$-probabilities can be represented by using
Born's rule:

$$p_C^b(x) = \int_\Theta d\theta \, p_C(\theta) \, |\psi_{C,\theta}(x)|^2. \tag{12}$$



5 Bayesian updating of state distribution 

Thus in our model the brain of $\tau_{QL}$ proceeds by using a mixture of
classical and quantum probabilities. The whole Bayesian scheme is purely
classical; "quantumness" appears in (12) only via Born's rule.

However, as always, there arises the problem of the choice of prior
probability distributions. Since the transition probabilities and the
$a$-probabilities are present even in the classical Bayesian framework, only
the phase distribution $p_C(\theta)$ makes a new (QL) contribution. A QL
cognitive system $\tau_{QL}$ should teach itself to choose $p_C(\theta)$ on the
basis of its previous experience of $b|a$ decision-making under the context $C$.
Such learning can be performed via the (conventional) Bayesian updating
procedure.

By combining Bayes' and Born's formulas, we get:

$$p_C(\theta|x) = \frac{p_C(x, \theta)}{p_C^b(x)} = \frac{p_C(\theta) |\psi_{C,\theta}(x)|^2}{\int_\Theta d\theta' \, p_C(\theta') |\psi_{C,\theta'}(x)|^2}. \tag{13}$$

Following the Bayesian scheme, $\tau_{QL}$ would like to maximize the
probability $p_C(\theta|x)$, i.e., to construct a map $m : X_b \to \Theta$,
$m(x) = \arg\max_\theta p_C(\theta|x)$, see [56]. Since the denominator in (13)
does not depend on $\theta$, this problem is reduced to maximization of the
joint probability density $p_C(x, \theta)$.

Suppose now that under the context $C$ the system $\tau_{QL}$ made the decision
$b = x$ and this decision was successful (so $\tau_{QL}$ got some form of
reward). Then $\tau_{QL}$ would update the distribution $p_C(\theta)$ by
maximizing $p_C(x, \theta)$. To simplify considerations and to extract the main
QL factor, we assume that the transition probabilities as well as the
$a$-probabilities are fixed. So, optimization is considered only with respect
to the interference angles $\theta$.

In the case of a doubly stochastic matrix of transition probabilities,
$\theta_{x_1} = \theta_{x_2} + \pi$ (mod $2\pi$), and hence we can consider the
one-dimensional phase parameter $\theta$.

Example 1. (Discrete distribution of phases) Some context $C$ is chosen.
Suppose that the transition probabilities as well as the $a$-probabilities are
equal to 1/2. Here the formula of total probability with the interference term
gives:

$$p_C(x_1|\theta) = \cos^2(\theta/2), \quad p_C(x_2|\theta) = \sin^2(\theta/2).$$

We remark that these probabilities coincide with the polarization (or spin 1/2)
probabilities obtained in QM, see e.g. [40]. It should be emphasized that
this is really a simple coincidence of mathematical formulas. In contrast to
e.g. [44], we do not consider physical quantum systems. We now consider
the simplest nontrivial case of a parametric set consisting of two points, e.g.
$\Theta = \{\theta_1 = \pi/2, \theta_2 = \pi\}$. So, this cognitive system has
reduced (on the basis of some information) the phases under the context $C$ to
two possible angles. Hence,
$p_C(x_1) = \frac{1}{2}(\cos^2 \pi/4 + \cos^2 \pi/2) = 1/4$,
$p_C(x_2) = \frac{1}{2}(\sin^2 \pi/4 + \sin^2 \pi/2) = 3/4$. Thus, under the
assumption that all phases in $\Theta$ are equally possible, the cognitive
system $\tau_{QL}$ finds that $p_C(x_2)$ is essentially larger than $p_C(x_1)$.
Hence, $\tau_{QL}$ makes the decision $b = x_2$. If the result of this decision
was positive (i.e. some form of reward was obtained), $\tau_{QL}$ would like to
update the state distribution. Since $p_C(x_2, \pi/2) = 1/4$ and
$p_C(x_2, \pi) = 1/2$, the cognitive system will put (in future
decision-making) more weight on $\theta_2 = \pi$; e.g. the updated distribution
given by (13) is $p_C(\pi/2) = 1/3$, $p_C(\pi) = 2/3$.
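
The arithmetic of Example 1 can be reproduced with a short Python sketch. It assumes, as in the example, a uniform prior on the two phases and takes $\theta$ to be the phase attached to $x_1$; the variable names are illustrative only.

```python
import math

thetas = [math.pi / 2, math.pi]   # the two admissible phases
prior = [0.5, 0.5]                # uniform prior p_C(theta)

def p_x_given_theta(x, theta):
    """p_C(x_1|theta) = cos^2(theta/2), p_C(x_2|theta) = sin^2(theta/2)."""
    return math.cos(theta / 2) ** 2 if x == 0 else math.sin(theta / 2) ** 2

# Joint probabilities p_C(x, theta) = p_C(theta) * p_C(x|theta), as in (10).
joint = [[prior[k] * p_x_given_theta(x, thetas[k]) for k in range(2)]
         for x in range(2)]

# Total b-probabilities: discrete version of (11).
total = [sum(row) for row in joint]

# Bayesian update (13) after a rewarded decision b = x_2 (index 1).
posterior_after_x2 = [j / total[1] for j in joint[1]]
```

The posterior weights follow directly from dividing each joint probability by the total probability of the rewarded decision.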

Example 2. (Continuous distribution of phases) Suppose that all transition
probabilities are equal. Let us consider the uniform distribution of phases
on $\Theta = [0, 2\pi)$: $dp_C(\theta) = \frac{1}{2\pi} d\theta$. Here
$p_C(x_1, \theta) = \frac{1}{2\pi} \cos^2(\theta/2)$,
$p_C(x_2, \theta) = \frac{1}{2\pi} \sin^2(\theta/2)$. Hence,
$p_C(x_1) = p_C(x_2) = 1/2$. Thus no definite decision can be made.

Example 3. Suppose that all transition probabilities are equal. Let us
consider the uniform distribution of phases on $\Theta = [0, \pi/2)$:
$dp_C(\theta) = \frac{2}{\pi} d\theta$. Here
$p_C(x_1, \theta) = \frac{2}{\pi} \cos^2(\theta/2)$,
$p_C(x_2, \theta) = \frac{2}{\pi} \sin^2(\theta/2)$. Hence,
$p_C(x_1) = \frac{1}{2} + \frac{1}{\pi}$, $p_C(x_2) = \frac{1}{2} - \frac{1}{\pi}$.
Thus the $b = x_1$ decision is preferred. For this decision the maximum is
attained at $\theta = 0$. Therefore this cognitive system would update
$p_C(\theta)$ by concentrating it at the point $\theta = 0$.
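
The integrals in Examples 2 and 3 can be checked numerically. The sketch below uses a plain midpoint rule, chosen only to keep the code dependency-free; the function name and the number of subintervals are arbitrary.

```python
import math

def averaged_probabilities(upper, n=100000):
    """Midpoint-rule average of cos^2(theta/2) and sin^2(theta/2)
    with respect to the uniform density on [0, upper)."""
    h = upper / n
    density = 1.0 / upper
    p1 = p2 = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        p1 += density * math.cos(theta / 2) ** 2 * h
        p2 += density * math.sin(theta / 2) ** 2 * h
    return p1, p2

example2 = averaged_probabilities(2 * math.pi)   # uniform on [0, 2*pi)
example3 = averaged_probabilities(math.pi / 2)   # uniform on [0, pi/2)
```

With the uniform prior on the full circle the two decisions are equally weighted, while restricting the phases to the first quadrant tilts the weights to roughly 0.82 against 0.18 in favour of $b = x_1$.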



6 Mixed state representation 

We remark that the preceding Bayesian considerations can be mathematically
represented by using mixed quantum states. Let us consider the density matrix:

$$\rho_C = \int_\Theta d\theta \, p(\theta) \, \rho_{C,\theta}, \qquad \rho_{C,\theta} = \psi_{C,\theta} \otimes \psi_{C,\theta}.$$

We obtain the representation:

$$p_C^b(x) = \mathrm{Tr} \, \rho_C \pi_x^b, \tag{14}$$

where $\pi_x^b$ is the orthogonal projector corresponding to the eigenvalue
$b = x$. Thus the quantity

$$\frac{p_C^b(x_1)}{p_C^b(x_2)} = \frac{\mathrm{Tr} \, \rho_C \pi_{x_1}^b}{\mathrm{Tr} \, \rho_C \pi_{x_2}^b} \tag{15}$$

is used in QL decision-making.
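
For a discrete phase prior the density matrix and the trace formula (14) can be written out explicitly. The following Python sketch uses the setting of Example 1 and reads $\psi \otimes \psi$ as the rank-one operator $|\psi\rangle\langle\psi|$; that reading, the phase labelling $\theta_{x_2} = \theta_{x_1} + \pi$ and all names are assumptions made for illustration.

```python
import cmath
import math

def psi(theta):
    """Amplitude (7) when all transition and a-probabilities equal 1/2,
    with theta attached to x_1 and theta + pi attached to x_2."""
    return [0.5 + 0.5 * cmath.exp(1j * theta),
            0.5 + 0.5 * cmath.exp(1j * (theta + math.pi))]

def density_matrix(phase_prior):
    """rho_C = sum_k p(theta_k) |psi_k><psi_k| for a discrete phase prior."""
    rho = [[0j, 0j], [0j, 0j]]
    for theta, weight in phase_prior:
        v = psi(theta)
        for i in range(2):
            for j in range(2):
                rho[i][j] += weight * v[i] * v[j].conjugate()
    return rho

def born_probability(rho, x):
    """Tr(rho * pi_x), where pi_x projects onto the basis vector e_x^b;
    in the b-basis this is just the diagonal entry rho[x][x]."""
    return rho[x][x].real

rho = density_matrix([(math.pi / 2, 0.5), (math.pi, 0.5)])
p1 = born_probability(rho, 0)
p2 = born_probability(rho, 1)
ratio = p1 / p2   # the quantity (15)
```

The diagonal of `rho` reproduces the total probabilities 1/4 and 3/4 of Example 1, so the ratio (15) equals 1/3 here.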
7 Comparing with standard quantum decision-making theory

In this section we would like to compare our approach with standard quantum
decision-making theory, see e.g. [3S], [ID]-[33], [33] (and references in these
works):

a) Interpretation. The crucial difference is that our formalism is not about
genuinely quantum physical systems, but about QL systems. Thus we do not need
quantum sources of randomness, e.g. electrons or photons, to perform our QL
decision-making. Moreover, the essence of QL behavior lies not in a special
class of systems, but in a special class of contexts or, to be more precise, in
the interrelation between contexts and observables.

b) Scheme of the decision-making. We consider a specific scheme (motivated
by PD, see appendix (section 9)) involving two supplementary ("incompatible")
observables $a$ and $b$. Moreover, in general one of them, namely $a$, is a
generalized quantum observable, see section 10.

c) Mathematics. We consider a specific parametrization of a prior quantum
state, namely, by the interference angle $\theta$.

d) Application. We apply our model to modelling the brain's functioning as a
macroscopic QL system or, to be more precise, as a macroscopic system realizing
specific interconnections between contexts and observables (inducing nontrivial
interference).



8 Bayes risk 

As usual in quantum decision-making, we consider the Bayes risk corresponding to
the deviation function $W_\theta(x)$, see [40], p. 46:

$$R_C^b = \int_\Theta dp(\theta) \sum_x W_\theta(x) \, p_C^b(x|\theta) = \int_\Theta dp(\theta) \sum_x W_\theta(x) \, |\psi_{C,\theta}(x)|^2 = \int_\Theta dp(\theta) \sum_x W_\theta(x) \, \mathrm{Tr} \, \rho_{C,\theta} \pi_x^b. \tag{16}$$

Typically in quantum decision theory the problem of finding the Bayes decision
rule is considered, e.g. [30], p. 46-50. However, we are not interested in this
problem, since the decision-making operator $\hat{b}$ is considered as given.⁶

6 Of course, it could also be modified in the process of brain's functioning, but we do not
consider this problem in the present paper.

In our model the brain is interested in minimizing the Bayes risk for the fixed
observable $b$ via variation of the prior distribution of interference phases.

We come back to Example 1. Now we do not fix the distribution of phases
on $\Theta = \{\theta_1 = \pi/2, \theta_2 = \pi\}$. Here $p = p(\theta_1)$ and
$1 - p = p(\theta_2)$ are parameters of the model. Suppose that the deviation
function is $W_{\theta_i}(x_j) = \delta_{ij}$. Thus the Bayes risk is
$R_C^b = p \, p_C^b(x_1|\theta_1) + (1 - p) \, p_C^b(x_2|\theta_2) = p \cos^2(\theta_1/2) + (1 - p) \sin^2(\theta_2/2) = p/2 + (1 - p) = 1 - p/2$.
Thus the Bayes risk is minimal for $p = 1$. Hence, the brain would modify the
prior (mixed) mental state into the (pure) mental state $\psi_{\pi/2}$.
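
The minimization in this example is simple enough to verify by a grid search. The Python sketch below evaluates the Bayes risk for the deviation function $W_{\theta_i}(x_j) = \delta_{ij}$ on a grid of prior weights $p$; the grid resolution and names are arbitrary choices made for illustration.

```python
import math

THETA_1, THETA_2 = math.pi / 2, math.pi

def bayes_risk(p):
    """Risk for W_{theta_i}(x_j) = delta_ij with prior (p, 1 - p)
    on (theta_1, theta_2): p cos^2(theta_1/2) + (1 - p) sin^2(theta_2/2)."""
    return (p * math.cos(THETA_1 / 2) ** 2
            + (1.0 - p) * math.sin(THETA_2 / 2) ** 2)

# Scan prior weights p = 0.00, 0.01, ..., 1.00 and keep the minimizer.
grid = [k / 100 for k in range(101)]
best_p = min(grid, key=bayes_risk)
```

Since the risk equals $1 - p/2$, it decreases linearly in $p$ and the search ends at the boundary point $p = 1$, i.e. at the pure state.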

9 Prisoner's Dilemma 

In game theory, PD is a type of non-zero-sum game in which two players can 
cooperate with or defect (i.e. betray) the other player. In this game, as in all 
game theory, the only concern of each individual player (prisoner) is maximizing 
his/her own payoff, without any concern for the other player's payoff. In the 
classic form of this game, cooperating is strictly dominated by defecting, so 
that the only possible equilibrium for the game is for all players to defect. In 
simpler terms, no matter what the other player does, one player will always gain 
a greater payoff by playing defect. Since in any situation playing defect is more 
beneficial than cooperating, all rational players will play defect. 

The classical PD is as follows: Two suspects, A and B, are arrested by 
the police. The police have insufficient evidence for a conviction, and, having 
separated both prisoners, visit each of them to offer the same deal: if one testifies 
for the prosecution against the other and the other remains silent, the betrayer 
goes free and the silent accomplice receives the full 10-year sentence. If both stay 
silent, both prisoners are sentenced to only six months in jail for a minor charge. 
If each betrays the other, each receives a two-year sentence. Each prisoner must 
make the choice of whether to betray the other or to remain silent. However, 
neither prisoner knows for sure what choice the other prisoner will make. So 
this dilemma poses the question: How should the prisoners act? The dilemma 
arises when one assumes that both prisoners only care about minimizing their 
own jail terms. Each prisoner has two options: to cooperate with his accomplice 
and stay quiet, or to defect from their implied pact and betray his accomplice 
in return for a lighter sentence. The outcome of each choice depends on the 
choice of the accomplice, but each prisoner must choose without knowing what 
his accomplice has chosen to do. In deciding what to do in strategic situations, 
it is normally important to predict what others will do. This is not the case 
here. If you knew the other prisoner would stay silent, your best move is to 
betray as you then walk free instead of receiving the minor sentence. If you 
knew the other prisoner would betray, your best move is still to betray, as you 
receive a lesser sentence than by silence. Betraying is a dominant strategy. The 
other prisoner reasons similarly, and therefore also chooses to betray. Yet by 
both defecting they get a lower payoff than they would get by staying silent. 
So rational, self-interested play results in each prisoner being worse off than if 
they had stayed silent, see e.g. wikipedia - "Prisoner's dilemma." The following 
mental contexts are involved in PD: 

Context $C$, representing the situation in which a player has no idea about the
planned action of the other player. Context $C_+^a$, representing the situation
in which the B-player supposes that A will cooperate, and context $C_-^a$, in
which A will compete. We can also consider similar contexts $C_\pm^b$. We define
dichotomous observables $a$ and $b$ corresponding to the actions of players A
and B: $a = +$ if A chooses to cooperate and $a = -$ if A chooses to compete;
$b$ is defined in the same way.

A priori the law of total probability might be violated for PD, since the
B-player is not able to combine contexts. If those contexts were represented
by subsets of a so-called space of "elementary events", as is done in classical
probability theory (based on Kolmogorov's (1933) measure-theoretic axiomatics),
the B-player would be able to consider the conjunction of the contexts $C$ and
e.g. $C_+^a$ and to operate in the context $C \wedge C_+^a$ (which would be
represented by the set $C \cap C_+^a$). But the very situation of PD is such
that one could not expect that the contexts $C$ and $C_\pm^a$ might be
peacefully combined. If the B-player obtains information about the planned
action of the A-player (or even if he just decides that A will play in a
definite way, e.g. that the context $C_+^a$ will be realized), then the context
$C$ is simply destroyed. It could not be combined with $C_+^a$.

We can introduce the following contextual probabilities:
$p_C^b(\pm) = P(b = \pm \mid C)$, the probabilities for actions of B under the
complex of mental conditions $C$; $p^{b|a}(\pm|+) = P(b = \pm \mid C_+^a)$ and
$p^{b|a}(\pm|-) = P(b = \pm \mid C_-^a)$, the probabilities for actions of B
under the complexes of mental conditions $C_+^a$ and $C_-^a$, respectively;
$p_C^a(\pm) = P(a = \pm \mid C)$, the prior probabilities which B assigns to
actions of A under the complex of mental conditions $C$.
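
The classical argument at the start of this section, that betraying is a dominant strategy, can be verified mechanically from the sentences quoted there (0, 0.5, 2 and 10 years). The table layout and function names in this Python sketch are illustrative choices.

```python
# Sentence in years for one prisoner, indexed by (own action, other's action).
SENTENCE = {
    ("silent", "silent"): 0.5,   # both stay silent: six months each
    ("silent", "betray"): 10.0,  # silent accomplice gets the full sentence
    ("betray", "silent"): 0.0,   # the betrayer goes free
    ("betray", "betray"): 2.0,   # each betrays the other: two years each
}
ACTIONS = ("silent", "betray")

def best_reply(other_action):
    """Action that minimizes the own sentence given the other's action."""
    return min(ACTIONS, key=lambda own: SENTENCE[(own, other_action)])

def strictly_dominant(action):
    """True if `action` gives a strictly shorter sentence than every
    alternative, whatever the other prisoner does."""
    return all(
        SENTENCE[(action, other)] < SENTENCE[(alt, other)]
        for other in ACTIONS
        for alt in ACTIONS
        if alt != action
    )

dominant = [a for a in ACTIONS if strictly_dominant(a)]
mutual_defection = SENTENCE[("betray", "betray")]
mutual_silence = SENTENCE[("silent", "silent")]
```

The check confirms both halves of the dilemma: betraying is the best reply to either action, yet mutual betrayal leaves each prisoner with a longer sentence than mutual silence.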



10 Appendix: Generalization of the QM formalism

Let us consider a finite dimensional Hilbert space H. Let £ = {ej}™ =1 be an 
orthonormal basis: 

^ = c i e i ' C J = C J WO e C ' 

3 

Each £ generates a class of (conventional) quantum observables, self-adjoint 
operators, see [59], [58] : 

a^ = ^2yjC j (il))e j , (18) 

3 

where X a = {yx, y n }, € R, yj ^ yi is the range of values of a (so we start 
with consideration of observables with nondegenerate spectra) . 

Let now £ = {ej}™ =1 be an arbitrary basis (thus in general (ej, e;) 7^ 0, i 7^ j) 
consisting of normalized vectors, i.e., (ej,ej) = 10 

7 We remark that QLRA, see section 3, produces the a-basis with normalized vectors,
$\|e^a_j\|^2 = 1$. It is a consequence of the stochasticity of an arbitrary matrix of transition probabilities
(which was used by QLRA to produce the a-basis). Thus we now consider a purely linear
algebraic version of this situation.

We generalize the Dirac-von Neumann formalism by considering observables
(18) for an arbitrary $\mathcal{E}$. We also consider an arbitrary nonzero vector of $H$ as a
pure quantum state. We postulate (by generalizing Born's postulate):



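$$P_\psi(a = y_j) = \frac{|c_j(\psi)|^2}{\sum_k |c_k(\psi)|^2}, \quad (19)$$

where the coefficients $c_j(\psi)$ are given by the expansion of $\psi$ with respect to the basis $\mathcal{E}$.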

If $\mathcal{E}$ is an orthonormal basis, then $\sum_j |c_j(\psi)|^2 = \|\psi\|^2$ and,
for a normalized vector $\psi$, we obtain the ordinary Born's rule.
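
The generalized Born's rule (19) can be illustrated numerically. The following is a minimal Python sketch, assuming a two-dimensional complex space and a basis of normalized but non-orthogonal vectors. The function names `expand` and `generalized_born` and the concrete vectors are illustrative choices and do not come from the formalism itself.

```python
import math

def expand(psi, e1, e2):
    """Return (c1, c2) with psi = c1*e1 + c2*e2 in C^2, via Cramer's rule."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if abs(det) < 1e-12:
        raise ValueError("e1 and e2 are linearly dependent, so they do not form a basis")
    c1 = (psi[0] * e2[1] - psi[1] * e2[0]) / det
    c2 = (e1[0] * psi[1] - e1[1] * psi[0]) / det
    return c1, c2

def generalized_born(psi, e1, e2):
    """Probabilities |c_j|^2 / sum_k |c_k|^2 for a nonzero vector psi."""
    weights = [abs(c) ** 2 for c in expand(psi, e1, e2)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative basis: both vectors have norm 1, but (e1, e2) != 0.
s = 1 / math.sqrt(2)
e1, e2 = (1.0, 0.0), (s, s)
psi = (0.0, 1.0)

c1, c2 = expand(psi, e1, e2)          # c1 = -1, c2 = sqrt(2)
print(abs(c1) ** 2 + abs(c2) ** 2)    # about 3, although ||psi||^2 = 1
print(generalized_born(psi, e1, e2))  # about [1/3, 2/3]
```

In this example the squared coefficients sum to about 3 rather than to 1, which is why the normalizing denominator is needed. For an orthonormal basis and a normalized vector the same function reduces to the ordinary Born's rule.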

Our generalization of the Dirac-von Neumann formalism is also very close
to another well-known (and very popular in QI) generalization of the class of
quantum observables, namely, to the formalism of POVM, [53], [JO]. To proceed
in this way, we introduce projectors on the basis vectors: $\pi_j \psi = c_j(\psi) e_j$. We
remark that $\pi_j^2 = \pi_j$, but in general $\pi_j^* \neq \pi_j$. We have: $|c_j(\psi)|^2 = (\pi_j \psi, \pi_j \psi) =
(M_j \psi, \psi)$, where $M_j = \pi_j^* \pi_j$. We remark that each $M_j$ is self-adjoint and, moreover,
positive semidefinite. We also set $M = \sum_j M_j$. Then our generalization of
Born's rule can be written as:

$$P_\psi(a = y_j) = \frac{\mathrm{Tr}\, \rho_\psi M_j}{\mathrm{Tr}\, \rho_\psi M}, \quad (20)$$

where $\rho_\psi = \psi \otimes \psi$. We remark that, for an arbitrary nonzero $\psi$, the operator $\rho_\psi$ is nonnegative and nonzero.

Now we generalize the conventional notion of the density operator by considering
any nonzero $\rho \geq 0$ as a generalized density operator (we recall that at
the moment we consider a finite-dimensional space). The corresponding generalization
of Born's postulate has the following form:

$$P_\rho(a = y_j) = \frac{\mathrm{Tr}\, \rho M_j}{\mathrm{Tr}\, \rho M}. \quad (21)$$

The only difference from the POVM formalism is that, in general, the operator $M \neq I$ (the
unit operator).

We remark that $(M\psi, \psi) = \sum_j |c_j(\psi)|^2 > 0$ for $\psi \neq 0$. Thus (we are in the
finite-dimensional case) the inverse operator $M^{-1}$ is well defined.
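
The operator $M$ can also be computed explicitly in a small example. The sketch below assumes a two-dimensional space and uses a coordinate formula that is our own reading of the definitions, not a statement from the text: if $B$ is the inverse of the matrix whose columns are the basis vectors, then $c_j(\psi) = \sum_l B_{jl} \psi_l$ and therefore $M = B^* B$. The names `coefficient_matrix`, `m_operator` and `quadratic_form` are hypothetical helpers.

```python
import math

def coefficient_matrix(e1, e2):
    """Inverse of the matrix with columns e1, e2; row j maps psi to c_j(psi)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if abs(det) < 1e-12:
        raise ValueError("e1 and e2 are linearly dependent, so they do not form a basis")
    return [[e2[1] / det, -e2[0] / det],
            [-e1[1] / det, e1[0] / det]]

def m_operator(e1, e2):
    """Matrix of M = B^* B, i.e. M[k][l] = sum_j conj(B[j][k]) * B[j][l]."""
    b = coefficient_matrix(e1, e2)
    return [[sum(complex(b[j][k]).conjugate() * b[j][l] for j in range(2))
             for l in range(2)]
            for k in range(2)]

def quadratic_form(m, psi):
    """sum_{k,l} conj(psi_k) * M[k][l] * psi_l, real for a self-adjoint M."""
    value = sum(complex(psi[k]).conjugate() * m[k][l] * psi[l]
                for k in range(2) for l in range(2))
    return value.real

# Illustrative normalized, non-orthogonal basis (an assumption of this sketch).
s = 1 / math.sqrt(2)
M = m_operator((1.0, 0.0), (s, s))
print(M)                               # about [[1, -1], [-1, 3]], so M != I
print(quadratic_form(M, (0.0, 1.0)))   # about 3 = |c_1|^2 + |c_2|^2 for psi = (0, 1)
```

For this basis the matrix has determinant of about 2 and a positive trace, so it is positive definite and invertible, in line with the remark above. For an orthonormal basis the same function returns the identity matrix.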

We now proceed with our formalization and consider an arbitrary (separable) 
Hilbert space H. 

Definition 10.1. A generalized quantum state is represented by an arbitrary
trace class nonnegative (nonzero) operator $\rho$: $\rho \geq 0$, $0 < \mathrm{Tr}\, \rho < \infty$.

Definition 10.2. A generalized quantum observable is represented by an
arbitrary (so in general non-normalized) positive operator valued measure $E$ on
a measurable space $(X, \mathcal{F})$ such that $E(X) > 0$.

Thus, for a generalized quantum observable $E$, we have:

1) $E(B) \geq 0$ for any set $B \in \mathcal{F}$, and $E(X) > 0$;

2) $E(\cup_{j=1}^\infty B_j) = \sum_{j=1}^\infty E(B_j)$ for all disjoint sequences $\{B_j\}$ in $\mathcal{F}$.

Generalized Born's rule: Let $\rho$ and $E$ be a generalized quantum state and
observable, respectively. Then the probability to find the result $x$ of the E-measurement
in a measurable set $B$ (for an ensemble represented by $\rho$) is given by

$$P_\rho(x \in B) = \frac{\mathrm{Tr}\, \rho E(B)}{\mathrm{Tr}\, \rho E(X)}.$$

We remark that $\mathrm{Tr}\, \rho E(X) > 0$. To prove this, we consider the spectral
expansion of the trace class operator $\rho = \sum_j q_j \psi_j \otimes \psi_j$. Here at least one
$q_j > 0$. Then $\mathrm{Tr}\, \rho E(X) = \sum_j q_j (E(X)\psi_j, \psi_j) > 0$.

We now come back to the model considered at the beginning of this section:
a finite-dimensional space. We would like to model in the abstract linear
algebra framework the situation considered in section 3. We consider two observables:
one is a conventional self-adjoint operator $b$ and the other is a generalized
observable $a$. Thus the b-basis $\mathcal{E}^b = \{e^b_i\}$ is orthonormal, but the
a-basis $\mathcal{E}^a = \{e^a_j\}$ need not be (but we emphasize that even the latter is
normalized). Any vector $e^a_j$ is a conventional (pure) quantum state. Thus
by the rules of conventional QM we can find "transition probabilities":
$p^{b|a}(x_i|y_j) = P_{e^a_j}(b = x_i) = |(e^a_j, e^b_i)|^2$. Since $\mathcal{E}^b$ is orthonormal, we have:
$\sum_i p^{b|a}(x_i|y_j) = \sum_i |(e^a_j, e^b_i)|^2 = \|e^a_j\|^2 = 1$. The matrix of b|a-transition probabilities
$P^{b|a}$ is stochastic (as it should be). However, if $\mathcal{E}^a$ is not orthonormal,
then $P^{b|a}$ is in general not doubly stochastic.

On the other hand, we can expand each $e^b_i$ with respect to $\mathcal{E}^a$: $e^b_i =
\sum_j c_j(e^b_i) e^a_j$. By our generalized Born's rule: $p^{a|b}(y_j|x_i) = P_{e^b_i}(a = y_j) =
|c_j(e^b_i)|^2 / \sum_k |c_k(e^b_i)|^2$. We have: $\sum_j p^{a|b}(y_j|x_i) = 1$. Thus the matrix of
transition probabilities $P^{a|b}$ is stochastic as well.
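
Both transition matrices can be computed for a concrete pair of bases. The following Python sketch assumes a two-dimensional space, an orthonormal b-basis and a normalized, non-orthogonal a-basis. All function names and the chosen vectors are illustrative assumptions. In the matrices below the conditioning value indexes the column, so "stochastic" means that every column sums to 1.

```python
import math

def inner(u, v):
    """Inner product (u, v) on C^2, linear in u and conjugate-linear in v."""
    return sum(x * complex(y).conjugate() for x, y in zip(u, v))

def expand(psi, e1, e2):
    """Return (c1, c2) with psi = c1*e1 + c2*e2, via Cramer's rule."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if abs(det) < 1e-12:
        raise ValueError("e1 and e2 are linearly dependent, so they do not form a basis")
    c1 = (psi[0] * e2[1] - psi[1] * e2[0]) / det
    c2 = (e1[0] * psi[1] - e1[1] * psi[0]) / det
    return c1, c2

def p_b_given_a(a_basis, b_basis):
    """Matrix with entry [i][j] = |(e^a_j, e^b_i)|^2."""
    return [[abs(inner(ea, eb)) ** 2 for ea in a_basis] for eb in b_basis]

def p_a_given_b(a_basis, b_basis):
    """Matrix with entry [j][i] = |c_j(e^b_i)|^2 / sum_k |c_k(e^b_i)|^2."""
    columns = []
    for eb in b_basis:
        weights = [abs(c) ** 2 for c in expand(eb, a_basis[0], a_basis[1])]
        total = sum(weights)
        columns.append([w / total for w in weights])
    # Transpose so that rows index the outcome y_j and columns the condition x_i.
    return [list(row) for row in zip(*columns)]

def column_sums(m):
    return [sum(col) for col in zip(*m)]

def row_sums(m):
    return [sum(row) for row in m]

s = 1 / math.sqrt(2)
a_basis = [(1.0, 0.0), (s, s)]          # normalized, not orthogonal
b_basis = [(1.0, 0.0), (0.0, 1.0)]      # orthonormal

P_ba = p_b_given_a(a_basis, b_basis)
P_ab = p_a_given_b(a_basis, b_basis)
print(column_sums(P_ba))  # about [1, 1]: stochastic
print(row_sums(P_ba))     # about [1.5, 0.5]: not doubly stochastic here
print(column_sums(P_ab))  # about [1, 1]: stochastic
```

In this example the b|a-matrix is stochastic but not doubly stochastic, while the a|b-matrix obtained from the generalized Born's rule is again stochastic.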

Finally, we remark that all previous considerations are valid even in the case 
when both observables are generalized. 



11 Appendix: Von Neumann postulate in cognitive science and psychology

In general the transition probabilities can depend on the cognitive context $C$
which was chosen for the first (unconditional) measurement:

$$p^{b|a}(x|y) = p^{b|a}_C(x|y).$$

But in some cases the dependence of the transition probabilities $p^{b|a}(x|y)$ on $C$ could
be reducible. In the experimental situation these probabilities (frequencies) are
found in the following way. First, cognitive systems interact with a context $C$.
In this way an ensemble $S_C$ of cognitive systems representing the context $C$ is
created. Then cognitive systems belonging to the ensemble $S_C$ interact with
the selection-context $C_y$, which is determined by the value $y$ of the mental observable
$a$. For example, students belonging to a group $S_C$ (which was trained
under a complex of mental or social conditions $C$) should answer the question
$a$. If this question is so disturbing for a student u that he would totally
forget about the previous C-training, then the transition probabilities do not
depend on $C$: $p^{b|a}(x|y)$. Since we are interested only in probabilities, such an
individual blocking can be generalized to "statistical blocking" - the dependence on
$C$ after the sequential a-measurement should be statistically negligible: the number
of persons who still use the original C-context (e.g., training) to reply to
the b-question (following the a-question) is negligibly small compared with the
total number of persons in a sample $S_C$ representing $C$.

We remark that this is the case in conventional quantum theory. Here, for incompatible
(noncommutative) observables (with nondegenerate spectra), the transition
probabilities $p^{b|a}(x|y) = |(e^a_y, e^b_x)|^2$ do not depend on the original context
$C$, i.e., a context preceding the $[a = y]$-selection (in QM terminology: "on
the original wave function $\psi$").

In quantum theory any $[a = y]$-selection destroys the memory of the preceding
physical context $C$. For example, suppose that we prepare electrons with a wave
function $\psi$ (which provides a symbolic representation of a context $C$, so
$\psi = \psi_C$). We measure the spin's projection on some direction b and then on another
direction a. The transition probability does not depend on $\psi$ (i.e., on $C$).

This is our contextual interpretation of the von Neumann projection postu- 
late Pj . 

We do not know the general situation for cognitive systems. Our conjecture
is the following:

Postulate. ("von Neumann postulate for mental observables") For any pair
$a, b$ of supplementary mental observables, the transition probability $p^{b|a}(x|y)$ is
completely determined by the preceding preparation - the context $C_y$ corresponding
to the $[a = y]$-selection.

We remark that by Axiom 1

$$p^{b|b}(x|x) = 1.$$

Thus if "a system was prepared in the state $e^b_x$," then a measurement of $b$ would
definitely give the value $b = x$.

To proceed in our contextual framework, we could be satisfied even by a 
weaker form of this postulate - we recall that QLRA works by using only two 
"reference observables." 

Postulate. ("Weak von Neumann postulate for mental observables") There
exist supplementary mental observables $a, b$ such that the transition probability
$p^{b|a}(x|y)$ is completely determined by the preceding preparation - the context $C_y$
corresponding to the $[a = y]$-selection.

8 It might be that the von Neumann projection postulate can be violated by cognitive
systems. In such a case we would not be able to construct the conventional quantum
representation of contexts by complex probability amplitudes.

9 We recall that we consider only observables with nondegenerate spectra. 